Corpus-based Dialectometry: Aggregate Morphosyntactic Variability in British English Dialects

Author

  • Benedikt Szmrecsanyi
Abstract

The research reported in this paper departs from most previous work in dialectometry in several ways. Empirically, it draws on frequency vectors derived from naturalistic corpus data and not on discrete atlas classifications. Linguistically, it is concerned with morphosyntactic (as opposed to lexical or pronunciational) variability. Methodologically, it marries the careful analysis of dialect phenomena in authentic, naturalistic texts to aggregational-dialectometrical techniques. Two research questions guide the investigation: First, on methodological grounds, is corpus-based dialectometry viable at all? Second, to what extent is morphosyntactic variation in non-standard British dialects patterned geographically? By way of validation, findings will be matched against previous work on the dialect geography of Great Britain.
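The aggregation idea described in the abstract can be illustrated with a minimal sketch. It assumes each dialect is represented by a vector of normalized feature frequencies and that pairwise Euclidean distance is the aggregate measure; the abstract does not state which distance measure the paper uses, so this is an assumption. The dialect labels, the number of features, and all frequency values below are invented for illustration and are not data from the paper.

```python
import math
from itertools import combinations

# Hypothetical toy data: normalized frequencies (e.g. per 10,000 words) of three
# unnamed morphosyntactic features in three made-up dialects. Not from the paper.
freqs = {
    "dialect_A": [12.0, 3.0, 7.0],
    "dialect_B": [10.0, 4.0, 9.0],
    "dialect_C": [2.0, 15.0, 1.0],
}

def euclidean(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def distance_matrix(vectors):
    """Aggregate pairwise distances over all features, keyed by dialect pair."""
    return {
        (a, b): euclidean(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }

distances = distance_matrix(freqs)

# List dialect pairs from most to least similar.
for (a, b), d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(f"{a} - {b}: {d:.2f}")
```

A distance matrix of this kind is what could then be compared against geography, for instance by checking whether neighbouring dialects tend to have smaller distances.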

Similar resources

Intonational Variation in the British Isles

Models of intonation are typically based on one dialect and one style and do not account for inter- or intra-speaker variability. Speech data from the IViE corpus, however, demonstrate considerable variation in English intonation that occurs both across and within dialects (IViE = Intonational Variation in English, UK ESRC award R000237145, http://www.phon.ox.ac.k/~esther/ivyweb). In this paper, ...

Dialect analysis and modeling for automatic classification

In this paper, we present our recent work in the analysis and modeling of speech under dialect. Dialect and accent significantly influence automatic speech recognition performance, and therefore it is critical to detect and classify non-native speech. In this study, we consider three areas that include: (i) prosodic structure (normalized f0, syllable rate, and sentence duration), (ii) phoneme a...

LSTM Autoencoders for Dialect Analysis

Computational approaches for dialectometry employed Levenshtein distance to compute an aggregate similarity between two dialects belonging to a single language group. In this paper, we apply a sequence-to-sequence autoencoder to learn a deep representation for words that can be used for meaningful comparison across dialects. In contrast to the alignment-based methods, our method does not requir...

Hedges in English for Academic Purposes: A Corpus-based study of Iranian EFL learners

Hedges, as tools to express tentativeness and doubt, have been studied in plenty of research papers in the Iranian EFL research setting. However, their use in a learner corpus, portraying Iranian learner English, is in need of more research attention. With this end in view, this study aimed at investigating how Iranian EFL learners who have majored in English-related fields in Iran deployed hed...

Towards a typological classification and description of HRTs in a multidialectal corpus of contemporary English

This paper investigates some of the phonetic characteristics of the High Rising Terminal (HRT), a major intonational innovation now attested in numerous dialects of English worldwide. Based on a corpus containing recordings of different geographical varieties of contemporary English, it presents an inventory of the intonation patterns used to realize the HRT. It also suggests that late rising c...

Journal title:
  • IJHAC

Volume 2, Issue

Pages -

Publication date: 2008